007_002_lab_DQN 2 (Nature 2015).html
# https://www.youtube.com/watch?v=ByB49iDMiZE&list=PLlMkM4tgfjnKsCWav-Z2F-MMFRx-2gMGG&index=16
# @
# DQN (2013):
# 1. Go deep
# 2. Use replay memory to store transitions and sample minibatches from it,
#    which breaks the correlations between consecutive samples
#    (a minimal sketch follows this list)
# DQN (2015):
# 3. Use a separate target network,
#    to resolve the non-stationary target problem
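# A minimal sketch of the replay-memory idea mentioned above, assuming a
# deque-based buffer and uniform random sampling; the names (REPLAY_MEMORY,
# store, sample_minibatch) and the capacity value are illustrative, not from
# the original code.
import random
from collections import deque

REPLAY_MEMORY = 50000  # illustrative capacity

# Transitions are stored as (state, action, reward, next_state, done) tuples
replay_buffer = deque(maxlen=REPLAY_MEMORY)

def store(transition):
    # Old transitions are dropped automatically once maxlen is reached
    replay_buffer.append(transition)

def sample_minibatch(batch_size=64):
    # Uniform random sampling breaks the temporal correlation between
    # consecutive transitions that plain online Q-learning suffers from
    return random.sample(replay_buffer, batch_size)
#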
# Core idea of DQN 2013
# img 2018-04-29 16-50-01.png
#
# Core idea of DQN 2015
# img 2018-04-29 16-50-37.png
#
# @
# How to create separate networks
# img 2018-04-29 16-51-24.png
#
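# A sketch of how the separate networks can be created, assuming TensorFlow 1.x
# (placeholders + sessions) and a simple one-hidden-layer Q-network; the class
# and parameter names (DQN, net_name, h_size) are illustrative. The point is
# that each network's variables live under their own variable scope, so the
# two copies stay independent.
import numpy as np
import tensorflow as tf  # TensorFlow 1.x style API

class DQN:
    def __init__(self, session, input_size, output_size, net_name="main"):
        self.session = session
        self.input_size = input_size
        self.output_size = output_size
        self.net_name = net_name
        self._build_network()

    def _build_network(self, h_size=16, learning_rate=1e-3):
        # All variables are created under the scope "main" or "target",
        # so the two networks do not share any weights
        with tf.variable_scope(self.net_name):
            self._X = tf.placeholder(tf.float32, [None, self.input_size], name="input_x")
            hidden = tf.layers.dense(self._X, h_size, activation=tf.nn.relu)
            self._Qpred = tf.layers.dense(hidden, self.output_size)

            self._Y = tf.placeholder(tf.float32, [None, self.output_size])
            self._loss = tf.reduce_mean(tf.square(self._Y - self._Qpred))
            self._train = tf.train.AdamOptimizer(learning_rate).minimize(self._loss)

    def predict(self, state):
        # Accepts a single state or a stacked batch of states
        x = np.reshape(state, [-1, self.input_size])
        return self.session.run(self._Qpred, feed_dict={self._X: x})

    def update(self, x_stack, y_stack):
        # Loss and train op are fetched in a single sess.run() call
        feed = {self._X: x_stack, self._Y: y_stack}
        return self.session.run([self._loss, self._train], feed_dict=feed)
#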
# @
# DQN vs targetDQN
# img 2018-04-29 16-52-22.png
#
# @
# How to handle two networks in code
# img 2018-04-29 16-54-35.png
#
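# A sketch of how the two networks might be handled in code, building on the
# DQN class sketched above; the sizes are illustrative (e.g. CartPole has a
# 4-dimensional state and 2 actions).
import tensorflow as tf  # TensorFlow 1.x style API

input_size, output_size = 4, 2

sess = tf.Session()
mainDQN = DQN(sess, input_size, output_size, net_name="main")      # network being trained
targetDQN = DQN(sess, input_size, output_size, net_name="target")  # frozen copy used for y
sess.run(tf.global_variables_initializer())

# The variable scopes keep the two sets of weights separate
for var in tf.trainable_variables():
    print(var.name)  # e.g. main/dense/kernel:0 ... target/dense/kernel:0 ...
#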
# @
# Copying a network means copying the values of its weights
# img 2018-04-29 16-56-59.png
#
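# A tiny illustration of the point above: copying a network copies the weight
# *values*, so right after the copy the two weight sets are equal, but they
# are stored separately and evolve independently afterwards (the numbers here
# are made up).
import numpy as np

main_W = np.array([[0.1, -0.3], [0.4, 0.2]])
target_W = main_W.copy()                  # value copy, not a shared reference

main_W += 0.05                            # a training step changes only the main weights
print(np.array_equal(main_W, target_W))   # False: the target keeps the old values
#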
# @
# Summary
# 1. Create two networks (main and target)
# 2. Make the target network the same as the main network:
#    target = mainNet
# 3. Run the environment loop:
#    when you build y, use the target network;
#    update the main network using that y
# 4. Periodically make the target network the same as the main network again,
#    by copying the main network's weights into the target network
# @
# Code related to replay train (targetDQN added)
# img 2018-04-29 17-01-39.png
#
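# A sketch of replay_train with the targetDQN added, assuming mainDQN and
# targetDQN expose the predict/update methods of the DQN class sketched above;
# dis is the discount factor (value illustrative).
import numpy as np

def replay_train(mainDQN, targetDQN, train_batch, dis=0.99):
    x_stack = np.empty(0).reshape(0, mainDQN.input_size)
    y_stack = np.empty(0).reshape(0, mainDQN.output_size)

    for state, action, reward, next_state, done in train_batch:
        Q = mainDQN.predict(state)
        if done:
            Q[0, action] = reward
        else:
            # The target y is built with the *target* network (DQN 2015),
            # not with the network that is being updated
            Q[0, action] = reward + dis * np.max(targetDQN.predict(next_state))

        x_stack = np.vstack([x_stack, np.reshape(state, [1, mainDQN.input_size])])
        y_stack = np.vstack([y_stack, Q])

    # Only the main network is updated; the target network stays frozen
    # until its weights are explicitly copied from the main network
    return mainDQN.update(x_stack, y_stack)
#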
# @
# Code related to copy network (variable)
# img 2018-04-29 17-02-23.png
#
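# A sketch of the network-copy code, assuming TensorFlow 1.x and the
# "main"/"target" variable scopes used above; it builds one assign op per
# trainable variable (target_w <- main_w).
import tensorflow as tf  # TensorFlow 1.x style API

def get_copy_var_ops(dest_scope_name="target", src_scope_name="main"):
    op_holder = []
    src_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=src_scope_name)
    dest_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=dest_scope_name)
    for src_var, dest_var in zip(src_vars, dest_vars):
        op_holder.append(dest_var.assign(src_var.value()))
    return op_holder

# Build the ops once, then run them whenever the target should be refreshed:
#   copy_ops = get_copy_var_ops()
#   sess.run(copy_ops)
#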
# Code related to bot play
# img 2018-04-29 17-02-57.png
#
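# A sketch of the bot-play code, assuming the pre-0.26 gym API (reset()
# returns the state, step() returns a 4-tuple) and the DQN class above; the
# trained main network is run greedily for one episode.
import numpy as np

def bot_play(mainDQN, env):
    state = env.reset()
    reward_sum = 0
    while True:
        env.render()
        action = np.argmax(mainDQN.predict(state))
        state, reward, done, _ = env.step(action)
        reward_sum += reward
        if done:
            print("Total score: {}".format(reward_sum))
            break
#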
# Code related to main()
# img 2018-04-29 17-03-49.png
#
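# A sketch of main() tying the pieces above together (DQN, get_copy_var_ops,
# replay_train, bot_play); the environment, hyperparameter values, and update
# schedules are illustrative, assuming the older gym CartPole API.
import random
from collections import deque

import gym
import numpy as np
import tensorflow as tf  # TensorFlow 1.x style API

def main(max_episodes=5000, dis=0.99, replay_memory=50000):
    env = gym.make("CartPole-v0")
    input_size = env.observation_space.shape[0]
    output_size = env.action_space.n
    replay_buffer = deque(maxlen=replay_memory)

    with tf.Session() as sess:
        # 1. Create two networks
        mainDQN = DQN(sess, input_size, output_size, net_name="main")
        targetDQN = DQN(sess, input_size, output_size, net_name="target")
        sess.run(tf.global_variables_initializer())

        # 2. Start with target == main
        copy_ops = get_copy_var_ops(dest_scope_name="target", src_scope_name="main")
        sess.run(copy_ops)

        # 3. Environment loop
        for episode in range(max_episodes):
            e = 1.0 / ((episode / 10) + 1)  # epsilon decay for exploration
            state = env.reset()
            done = False

            while not done:
                if np.random.rand() < e:
                    action = env.action_space.sample()
                else:
                    action = np.argmax(mainDQN.predict(state))

                next_state, reward, done, _ = env.step(action)
                replay_buffer.append((state, action, reward, next_state, done))
                state = next_state

            if episode % 10 == 1:
                # Train the main network on random minibatches from the buffer
                for _ in range(50):
                    minibatch = random.sample(replay_buffer, min(len(replay_buffer), 64))
                    loss, _ = replay_train(mainDQN, targetDQN, minibatch, dis)
                # 4. Periodically refresh the target network from the main one
                sess.run(copy_ops)

        bot_play(mainDQN, env)

if __name__ == "__main__":
    main()
#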
# Exercise 1
# Tune hyperparameters (learning rate, sample size, decay factor)
# Network structure:
#   add bias terms
#   test tanh, sigmoid, relu, etc.
# Improve the TF network to reduce sess.run() calls (see the sketch below)
# Reward redesign
# img 2018-04-29 17-06-37.png
#
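# One possible direction for the "reduce sess.run() calls" item above: build
# the whole minibatch with a few batched predict()/update() calls instead of
# one sess.run() per transition. This assumes the DQN class sketched earlier;
# the function name is illustrative.
import numpy as np

def replay_train_vectorized(mainDQN, targetDQN, train_batch, dis=0.99):
    states = np.vstack([np.reshape(x[0], [1, mainDQN.input_size]) for x in train_batch])
    next_states = np.vstack([np.reshape(x[3], [1, mainDQN.input_size]) for x in train_batch])
    actions = np.array([x[1] for x in train_batch])
    rewards = np.array([x[2] for x in train_batch], dtype=np.float32)
    dones = np.array([x[4] for x in train_batch], dtype=np.float32)

    # One predict() on the target network for all next states,
    # one predict() on the main network for all current states,
    # one update() for the whole batch: 3 sess.run() calls per minibatch
    Q_target = rewards + dis * np.max(targetDQN.predict(next_states), axis=1) * (1.0 - dones)
    y = mainDQN.predict(states)
    y[np.arange(len(train_batch)), actions] = Q_target
    return mainDQN.update(states, y)
#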
# Exercise 2
# Car race with DQN 2015
# img 2018-04-29 17-07-20.png
#
# Exercise 3
# DQN implementations
# Other games
# RAM approach
# img 2018-04-29 17-08-00.png
#